Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo period.
Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.
- Ozay, N; Balzano, L; Panagou, D; Abate, A (Eds.) We consider the problem of learning a realization of a partially observed bilinear dynamical system (BLDS) from noisy input-output data. Given a single trajectory of input-output samples, we provide an algorithm and a finite-time analysis for learning the system's Markov-like parameters, from which a balanced realization of the bilinear system can be obtained. The stability of a BLDS depends on the sequence of inputs used to excite the system. Moreover, our identification algorithm regresses the outputs onto highly correlated, nonlinear, and heavy-tailed covariates. These properties, unique to partially observed bilinear dynamical systems, pose significant challenges to the analysis of our algorithm for learning the unknown dynamics. We address these challenges and provide high-probability error bounds for our identification algorithm under a uniform stability assumption. Our analysis offers insights into the system-theoretic quantities that affect learning accuracy and sample complexity. Lastly, we perform numerical experiments with synthetic data to reinforce these insights. (Free, publicly accessible full text available May 22, 2026.)
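For intuition, the regression at the heart of such an identification algorithm can be sketched on a scalar system. Everything below is a hypothetical toy instance (the system x_{t+1} = a·x_t + n·u_t·x_t + b·u_t, y_t = c·x_t and all parameter values are made up for illustration): the first few Markov-like parameters cb, cab, and cnb are recovered by least squares against nonlinear, correlated input covariates.

```python
import numpy as np

rng = np.random.default_rng(0)
a, n_, b, c = 0.5, 0.2, 1.0, 1.0   # hypothetical scalar BLDS parameters
T = 20000

# simulate a single input-output trajectory of
# x_{t+1} = a x_t + n u_t x_t + b u_t,  y_t = c x_t + noise
u = rng.normal(size=T)
x = np.zeros(T + 1)
for t in range(T):
    x[t + 1] = a * x[t] + n_ * u[t] * x[t] + b * u[t]
y = c * x[1:] + 0.01 * rng.normal(size=T)

# regress y_t on the leading (nonlinear, heavy-tailed) input covariates:
# y_t ≈ cb·u_{t-1} + cab·u_{t-2} + cnb·u_{t-1}u_{t-2} + geometrically decaying tail
yt = y[1:]
Phi = np.column_stack([u[1:], u[:-1], u[1:] * u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, yt, rcond=None)
print(theta)  # should approximate [c*b, c*a*b, c*n*b] = [1.0, 0.5, 0.2]
```

The tail of the Markov expansion acts as zero-mean residual noise here, which is why plain least squares recovers the leading parameters from a single long trajectory.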
- Li, Y; Mandt, S; Agrawal, S; Khan, E (Eds.) We study the problem of representational transfer in offline reinforcement learning (RL), where a learner has access to episodic data from a number of source tasks collected a priori and aims to learn a shared representation to be used in finding a good policy for a target task. Unlike in online RL, where the agent interacts with the environment while learning a policy, in the offline setting there can be no such interaction with either the source tasks or the target task; thus multi-task offline RL can suffer from incomplete coverage. We propose an algorithm to compute pointwise uncertainty measures for the learnt representation in low-rank MDPs and establish a data-dependent upper bound on the suboptimality of the learnt policy for the target task. Our algorithm leverages the collective exploration done by the source tasks to mitigate poor coverage at some points by a few tasks, thus overcoming the limitation of existing offline algorithms, which need uniformly good coverage for a meaningful transfer. We complement our theoretical results with an empirical evaluation on a rich-observation MDP that requires many samples for complete coverage. Our findings illustrate the benefits of penalizing and quantifying the uncertainty in the learnt representation. (Free, publicly accessible full text available April 23, 2026.)
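One standard construction for such pointwise uncertainty measures (not necessarily the paper's exact one) is the elliptical bonus b(φ) = √(φᵀΛ⁻¹φ), where Λ is the regularized covariance of the features seen in the pooled source data. A minimal sketch, with synthetic features standing in for a learnt low-rank MDP representation:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
# hypothetical learnt features phi(s, a) pooled across the source tasks
Phi = rng.normal(size=(500, d))
Phi[:, -1] *= 0.05                       # one poorly-covered feature direction

lam = 1.0
Lambda = Phi.T @ Phi + lam * np.eye(d)   # regularized feature covariance
Lambda_inv = np.linalg.inv(Lambda)

def bonus(phi):
    """Pointwise uncertainty: large where the source data cover phi poorly."""
    return float(np.sqrt(phi @ Lambda_inv @ phi))

well_covered = bonus(np.eye(d)[0])       # direction seen often in the data
poorly_covered = bonus(np.eye(d)[-1])    # direction seen rarely
print(well_covered, poorly_covered)
```

Penalizing the learnt policy by such a bonus is the usual mechanism behind data-dependent suboptimality bounds in the offline setting.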
- The rise of foundation models fine-tuned on human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of the robustness of learning algorithms against such attacks. Existing research on provable certified robustness against data-poisoning attacks primarily focuses on certifying robustness against static adversaries, who modify a fraction of the training dataset before the learning algorithm is applied. In practice, particularly when learning from human feedback online, adversaries can observe and react to the learning process and inject poisoned samples that optimize adversarial objectives better than if they were restricted to poisoning a static dataset once, before learning begins. Indeed, prior work has shown that online dynamic adversaries can be significantly more powerful than static ones. We present a novel framework for computing certified bounds on the impact of dynamic poisoning and use these certificates to design robust learning algorithms. We illustrate the framework on the mean estimation and binary classification problems and outline directions for extending it in future work. (Free, publicly accessible full text available April 23, 2026.)
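For intuition, even in the static setting a clipped-mean estimator admits a simple certificate: if an adversary modifies k of N points, each clipped to [-B, B], the estimate can shift by at most 2kB/N. A minimal numerical check with hypothetical constants (the framework above is built to certify the strictly stronger dynamic adversary):

```python
import numpy as np

rng = np.random.default_rng(2)
N, k, B = 1000, 50, 3.0
clean = rng.normal(size=N)

def clipped_mean(xs, B):
    return float(np.mean(np.clip(xs, -B, B)))

base = clipped_mean(clean, B)
# a worst-case static attack on this estimator: push k points to the clip boundary
poisoned = clean.copy()
poisoned[:k] = B
shift = abs(clipped_mean(poisoned, B) - base)
cert = 2 * k * B / N   # certified upper bound on the achievable shift
print(shift, cert)
```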
- We study the problem of online resource allocation, where customers arrive sequentially and the seller must irrevocably allocate resources to each incoming customer while also facing a prespecified procurement cost function over the total allocation. The objective is to maximize the reward obtained from fulfilling the customers' requests minus the cumulative procurement cost. We analyze the competitive ratio of a primal-dual algorithm in this setting and develop an optimization framework for designing a surrogate function for the procurement cost, to be used by the algorithm to improve its competitive ratio. We use the optimal surrogate function for polynomial procurement cost functions to improve on previous bounds. For general procurement cost functions, our design method uses quasiconvex optimization to find optimal design parameters. We then implement the design techniques and show the improved performance of the algorithm in numerical examples. Finally, we extend the analysis by devising a posted-pricing mechanism in which the algorithm does not require the customers' preferences to be revealed. Funding: M. Fazel's work was supported in part by the National Science Foundation [Awards 2023166, 2007036, and 1740551]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/ijoo.2021.0012. (Free, publicly accessible full text available December 23, 2025.)
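The primal-dual scheme can be sketched as a threshold rule: maintain a pseudo-price equal to the derivative of a surrogate cost evaluated at the current total allocation, and accept an arriving request only if its per-unit reward beats that price. The quadratic cost and the surrogate scaling below are hypothetical illustrations, not the paper's optimized design:

```python
import numpy as np

rng = np.random.default_rng(3)

f = lambda x: x ** 2             # true procurement cost (hypothetical)
phi_prime = lambda x: 3.0 * x    # derivative of surrogate 1.5*x^2 (hypothetical scaling)

total, profit = 0.0, 0.0
for _ in range(200):
    reward, size = rng.uniform(0, 10), rng.uniform(0, 1)  # arriving request
    # accept irrevocably iff per-unit reward exceeds the current pseudo-price
    if reward > phi_prime(total):
        before = f(total)
        total += size
        profit += reward * size - (f(total) - before)
print(total, profit)
```

Inflating the surrogate relative to the true cost makes the rule more conservative late in the horizon, which is the lever the design framework tunes to improve the competitive ratio.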
- Free, publicly accessible full text available November 4, 2025.
- In multiplayer games, self-interested behavior among the players can harm the social welfare. Tax mechanisms are a common method to alleviate this issue and induce socially optimal behavior. In this work, we take an initial step toward learning the optimal tax that maximizes social welfare with limited feedback in congestion games. We propose a new type of feedback, named equilibrium feedback, in which the tax designer can only observe the Nash equilibrium reached after deploying a tax plan. Existing algorithms are not applicable due to the exponentially large tax function space, the nonexistence of the gradient, and the nonconvexity of the objective. To tackle these challenges, we design a computationally efficient algorithm that leverages several novel components: (1) a piecewise-linear tax to approximate the optimal tax; (2) extra linear terms to guarantee a strongly convex potential function; (3) an efficient subroutine to find an exploratory tax that provides critical information about the game. The algorithm finds an ε-optimal tax with O(βF²/ε²) sample complexity, where β is the smoothness of the cost function and F is the number of facilities.
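Equilibrium feedback can be illustrated in a tiny congestion game: the designer posts a tax on each facility and observes only the resulting Nash equilibrium load profile, found here by minimizing the Rosenthal potential over all splits of the players. The two-facility game, latencies, and tax values below are hypothetical:

```python
m = 10                                   # players
lat = [lambda x: x, lambda x: 2 * x]     # facility latencies (hypothetical)

def equilibrium(tax):
    """Equilibrium feedback: the designer sees only the Nash load profile,
    obtained by minimizing the Rosenthal potential over all splits."""
    def potential(k):                    # k players on facility 0
        p0 = sum(lat[0](i) + tax[0] for i in range(1, k + 1))
        p1 = sum(lat[1](i) + tax[1] for i in range(1, m - k + 1))
        return p0 + p1
    return min(range(m + 1), key=potential)

def social_cost(k):                      # taxes are transfers, excluded from welfare
    return k * lat[0](k) + (m - k) * lat[1](m - k)

k_no_tax = equilibrium((0.0, 0.0))
k_taxed = equilibrium((3.0, 0.0))        # hypothetical tax nudging players off facility 0
print(k_no_tax, social_cost(k_no_tax), k_taxed, social_cost(k_taxed))
```

Each call to `equilibrium` is one sample of equilibrium feedback; the learning problem is to choose the tax from such observations alone, without seeing costs or gradients directly.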
- We study the gradient Expectation-Maximization (EM) algorithm for Gaussian mixture models (GMMs) in the over-parameterized setting, where a general GMM with n > 1 components learns from data generated by a single ground-truth Gaussian distribution. While results for the special case of 2-Gaussian mixtures are well known, a general global convergence analysis for arbitrary n remains unresolved and faces several new technical barriers, since the convergence becomes sublinear and non-monotonic. To address these challenges, we construct a novel likelihood-based convergence analysis framework and rigorously prove that gradient EM converges globally at a sublinear rate O(1/√t). This is the first global convergence result for Gaussian mixtures with more than 2 components. The sublinear convergence rate is due to the algorithmic nature of learning over-parameterized GMMs with gradient EM. We also identify a new technical challenge in learning general over-parameterized GMMs: the existence of bad local regions that can trap gradient EM for an exponential number of steps.
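A minimal sketch of gradient EM in this over-parameterized setting (unit variances, uniform weights, and all initial values hypothetical): data come from a single Gaussian, while a 3-component mixture is fitted by gradient ascent on the log-likelihood using E-step responsibilities.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=2000)    # data from a single ground-truth Gaussian

n = 3                                   # over-parameterized: n > 1 components
mu = np.array([-2.0, 0.5, 3.0])         # hypothetical initial means
eta = 1.0                               # step size

def log_joint(X, mu):
    # log of (1/n) * N(x; mu_k, 1) per point/component (unit variances)
    return -0.5 * (X[:, None] - mu[None, :]) ** 2 - 0.5 * np.log(2 * np.pi) - np.log(n)

def avg_loglik(X, mu):
    lj = log_joint(X, mu)
    m = lj.max(axis=1, keepdims=True)
    return float(np.mean(m + np.log(np.exp(lj - m).sum(axis=1, keepdims=True))))

ll0 = avg_loglik(X, mu)
for _ in range(500):
    lj = log_joint(X, mu)
    r = np.exp(lj - lj.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)                 # E-step responsibilities
    mu = mu + eta * (r * (X[:, None] - mu[None, :])).mean(axis=0)  # gradient step on means
ll1 = avg_loglik(X, mu)
print(mu, ll0, ll1)
```

The means drift slowly toward the ground-truth mean rather than snapping to it, which is the sublinear, non-monotonic behavior the analysis above contends with.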
- Free, publicly accessible full text available December 1, 2025.
- Globerson, A; Mackey, L; Belgrave, D; Fan, A; Paquet, U; Tomczak, J; Zhang, C (Eds.) This paper investigates ML systems serving a group of users with multiple models/services, each aimed at specializing to a sub-group of users. We consider settings where, upon deployment of a set of services, users choose the one minimizing their personal losses, and the learner iteratively learns by interacting with diverse users. Prior research shows that the outcomes of the learning dynamics, which comprise both the services' adjustments and the users' service selections, hinge significantly on the initial conditions. However, finding good initial conditions faces two main challenges: (i) bandit feedback: typically, data on user preferences are not available before deploying services and observing user behavior; (ii) suboptimal local solutions: the total loss landscape (i.e., the sum of loss functions across all users and services) is not convex, and gradient-based algorithms can get stuck in poor local minima. We address these challenges with a randomized algorithm that adaptively selects a minimal set of users for data collection in order to initialize a set of services. Under mild assumptions on the loss functions, we prove that our initialization leads to a total loss within a factor of the globally optimal total loss (achievable with complete user preference data), with the factor scaling logarithmically in the number of services. This result generalizes the well-known k-means++ guarantee to a broad problem class and is also of independent interest. The theory is complemented by experiments on real as well as semi-synthetic datasets.
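The guarantee generalizes k-means++-style D²-sampling, in which each new center is drawn with probability proportional to a point's squared distance to the nearest center already chosen. A minimal sketch of that seeding rule on synthetic points (standing in for users and services):

```python
import numpy as np

def dsquared_seeding(X, k, rng):
    """k-means++ initialization: sample each new center with probability
    proportional to squared distance from the nearest chosen center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        C = np.array(centers)
        d2 = np.min(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

rng = np.random.default_rng(0)
# three tight clusters of "users"; good seeds should land near distinct clusters
X = np.concatenate([rng.normal(c, 0.1, size=(50, 2)) for c in (-5.0, 0.0, 5.0)])
centers = dsquared_seeding(X, 3, rng)
print(centers)
```

The adaptive-selection algorithm replaces squared Euclidean distance with each user's personal loss under the currently deployed services, which is what the logarithmic-factor guarantee covers.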
 An official website of the United States government